Understanding individual human mobility patterns 

Despite their importance for urban planning [1], traffic forecasting [2], and the spread
of biological [3, 4, 5] and mobile viruses [6], our understanding of the basic laws governing
human motion remains limited owing to the lack of tools to monitor the time resolved
location of individuals. Here we study the trajectory of 100,000 anonymized mobile phone
users whose position is tracked for a six month period. We find that, in contrast with the
random trajectories predicted by the prevailing Levy flight and random walk models [7],
human trajectories show a high degree of temporal and spatial regularity, each individual
being characterized by a time independent characteristic length scale and a significant
probability to return to a few highly frequented locations. After correcting for differences in
travel distances and the inherent anisotropy of each trajectory, the individual travel patterns
collapse into a single spatial probability distribution, indicating that despite the diversity of
their travel history, humans follow simple reproducible patterns. This inherent similarity
in travel patterns could impact all phenomena driven by human mobility, from epidemic
prevention to emergency response, urban planning and agent based modeling.

Given the many unknown factors that influence a population's mobility patterns, ranging from 
means of transportation to job and family imposed restrictions and priorities, human trajectories 
are often approximated with various random walk or diffusion models [7, 8]. Indeed, early mea- 
surements on albatrosses, bumblebees, deer and monkeys [9, 10] and more recent ones on marine 
predators [11] suggested that animal trajectories are approximated by a Levy flight [12, 13], a random
walk whose step size $\Delta r$ follows a power-law distribution $P(\Delta r) \sim \Delta r^{-(1+\beta)}$ with $\beta < 2$. While
the Levy statistics for some animals require further study [14], Brockmann et al. [7] generalized
this finding to humans, documenting that the distribution of distances between consecutive sightings
of nearly half a million bank notes is fat tailed. Given that money is carried by individuals,
bank note dispersal is a proxy for human movement, suggesting that human trajectories are best 
modeled as a continuous time random walk with fat tailed displacements and waiting time dis- 
tributions [7]. A particle following a Levy flight has a significant probability to travel very long 
distances in a single step [12, 13], which appears to be consistent with human travel patterns: most 
of the time we travel only over short distances, between home and work, while occasionally we 
take longer trips. 
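
To make the Levy flight picture concrete, the following minimal Python sketch draws step lengths from a pure power law $P(\Delta r) \sim \Delta r^{-(1+\beta)}$ by inverse-transform sampling and moves a walker in a uniformly random direction at each step. The lower cutoff `r_min`, the function names and the isotropic choice of direction are illustrative assumptions and are not taken from the paper.

```python
import math
import random

def sample_levy_step(beta, r_min=1.0, rng=random):
    """Draw one step length with P(dr) ~ dr^-(1+beta) for dr >= r_min.

    Inverse-transform sampling: the survival function is (r_min/dr)^beta.
    r_min is an assumed lower cutoff needed to normalize the power law.
    """
    u = 1.0 - rng.random()  # uniform in (0, 1], avoids u == 0
    return r_min * u ** (-1.0 / beta)

def levy_flight(n_steps, beta, r_min=1.0, seed=0):
    """Return the list of (x, y) positions of a 2D Levy flight from the origin."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    path = [(x, y)]
    for _ in range(n_steps):
        dr = sample_levy_step(beta, r_min, rng)
        theta = rng.uniform(0.0, 2.0 * math.pi)  # assumed isotropic direction
        x += dr * math.cos(theta)
        y += dr * math.sin(theta)
        path.append((x, y))
    return path
```

Most sampled steps stay close to `r_min`, while rare draws are orders of magnitude longer, which is the "occasional long trip" behavior described above.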

Each consecutive sighting of a bank note reflects the composite motion of two or more individuals
who owned the bill between two reported sightings. Thus it is not clear if the observed
distribution reflects the motion of individual users, or some hitherto unknown convolution between
population based heterogeneities and individual human trajectories. In contrast to bank notes,
mobile phones are carried by the same individual during his/her daily routine, offering the best proxy
to capture individual human trajectories [15, 16, 17, 18, 19].

We used two data sets to explore the mobility pattern of individuals. The first ($D_1$) consists of
the mobility patterns recorded over a six month period for 100,000 individuals selected randomly
from a sample of over 6 million anonymized mobile phone users. Each time a user initiates or
receives a call or SMS, the location of the tower routing the communication is recorded, allowing
us to reconstruct the user's time resolved trajectory (Figs. 1a and b). The time between consecutive
calls follows a bursty pattern [20] (see Fig. S1 in the SM), indicating that while most consecutive
calls are placed soon after a previous call, occasionally there are long periods without any call
activity. To make sure that the obtained results are not affected by the irregular call pattern, we
also study a data set ($D_2$) that captures the location of 206 mobile phone users, recorded every two
hours for an entire week. In both datasets the spatial resolution is determined by the local density
of the more than $10^4$ mobile towers, registering movement only when the user moves between
areas serviced by different towers. The average service area of each tower is approximately 3 km$^2$
and over 30% of the towers cover an area of 1 km$^2$ or less.

To explore the statistical properties of the population's mobility patterns we measured the distance
between each user's positions at consecutive calls, capturing 16,264,308 displacements for the
$D_1$ and 10,407 displacements for the $D_2$ datasets. We find that the distribution of displacements
over all users is well approximated by a truncated power-law

    $P(\Delta r) = (\Delta r + \Delta r_0)^{-\beta} \exp(-\Delta r/\kappa)$,    (1)

with $\beta = 1.75 \pm 0.15$, $\Delta r_0 = 1.5$ km and cutoff values $\kappa|_{D_1} = 400$ km and $\kappa|_{D_2} = 80$ km
(Fig. 1c, see the SM for statistical validation). Note that the observed scaling exponent is not far
from $\beta_B = 1.59$ observed in Ref. [7] for bank note dispersal, suggesting that the two distributions
may capture the same fundamental mechanism driving human mobility patterns.
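
As an illustration of the functional form in Eq. (1), the sketch below evaluates the unnormalized truncated power law with the quoted $D_1$ parameters and estimates its normalization constant numerically. The trapezoid rule, the upper integration limit `r_max` and the function names are assumptions made for this example; the paper's fitting procedure is described in the SM and is not reproduced here.

```python
import math

def truncated_power_law(dr, beta=1.75, dr0=1.5, kappa=400.0):
    """Unnormalized Eq. (1): (dr + dr0)^-beta * exp(-dr / kappa), dr in km."""
    return (dr + dr0) ** (-beta) * math.exp(-dr / kappa)

def normalization(beta=1.75, dr0=1.5, kappa=400.0, r_max=5000.0, n=100000):
    """Trapezoid-rule estimate of the integral of Eq. (1) over [0, r_max].

    r_max and n are assumed numerical settings; beyond a few multiples of
    kappa the exponential cutoff makes the remaining tail negligible.
    """
    h = r_max / n
    total = 0.5 * (truncated_power_law(0.0, beta, dr0, kappa)
                   + truncated_power_law(r_max, beta, dr0, kappa))
    for i in range(1, n):
        total += truncated_power_law(i * h, beta, dr0, kappa)
    return total * h
```

Dividing `truncated_power_law(dr)` by `normalization()` gives a probability density over displacements in km, under the stated assumptions.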

Equation (1) suggests that human motion follows a truncated Levy flight [7]. Yet, the observed 
shape of P(Ar) could be explained by three distinct hypotheses: A. Each individual follows a 
Levy trajectory with jump size distribution given by (1). B. The observed distribution captures a 
population based heterogeneity, corresponding to the inherent differences between individuals. C. 
A population based heterogeneity coexists with individual Levy trajectories, hence (1) represents
a convolution of hypotheses A and B.

To distinguish between hypotheses A, B and C we calculated the radius of gyration for each
user (see Methods), interpreted as the typical distance traveled by user $a$ when observed up to
time $t$ (Fig. 1b). Next, we determined the radius of gyration distribution $P(r_g)$ by calculating $r_g$
for all users in samples $D_1$ and $D_2$, finding that it can also be approximated with a truncated
power-law

    $P(r_g) = (r_g + r_g^0)^{-\beta_r} \exp(-r_g/\kappa)$,    (2)

with $r_g^0 = 5.8$ km, $\beta_r = 1.65 \pm 0.15$ and $\kappa = 350$ km (Fig. 1d, see SM for statistical validation).
Levy flights are characterized by a high degree of intrinsic heterogeneity, raising the possibility
that (2) could emerge from an ensemble of identical agents, each following a Levy trajectory.
Therefore, we determined $P(r_g)$ for an ensemble of agents following a Random Walk (RW),
Levy-Flight (LF) or Truncated Levy-Flight (TLF) (Fig. 1d) [8, 12, 13]. We find that an ensemble
of Levy agents displays a significant degree of heterogeneity in $r_g$, yet this is not sufficient to
explain the truncated power law distribution $P(r_g)$ exhibited by the mobile phone users. Taken
together, Figs. 1c and d suggest that the difference in the range of typical mobility patterns of
individuals ($r_g$) has a strong impact on the truncated Levy behavior seen in (1), ruling out hypothesis
A.
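
The radius of gyration used above can be sketched in a few lines of Python. The Methods section is not reproduced in this excerpt, so the code assumes the standard definition: the root-mean-square distance of the recorded positions from the trajectory's center of mass. It also assumes positions are already given as planar (x, y) coordinates in km; the function name is illustrative.

```python
import math

def radius_of_gyration(points):
    """Root-mean-square distance of positions from their center of mass.

    points: list of (x, y) positions in km, one per recorded call.
    Assumes planar coordinates; no geographic projection is applied.
    """
    n = len(points)
    if n == 0:
        raise ValueError("need at least one recorded position")
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mean_sq = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n
    return math.sqrt(mean_sq)
```

For example, a user observed equally often at two towers 2 km apart has a radius of gyration of 1 km. Truncating `points` to the positions recorded up to time $t$ gives the time-dependent quantity discussed in the text.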

If individual trajectories are described by a LF or TLF, then the radius of gyration should
increase in time as $r_g(t) \sim t^{3/(2+\beta)}$ [22], while for a RW $r_g(t) \sim t^{1/2}$. That is, the longer we
observe a user, the higher the chances that she/he will travel to areas not visited before. To check
the validity of these predictions we measured the time dependence of the radius of gyration for
users whose gyration radius would be considered small ($r_g(T) < 3$ km), medium ($20 < r_g(T) < 30$ km)
or large ($r_g(T) > 100$ km) at the end of our observation period ($T = 6$ months). The
results indicate that the time dependence of the average radius of gyration of mobile phone users
is better approximated by a logarithmic increase, which is not only manifestly slower than
the growth predicted by a power law, but may also appear similar to a saturation process (Fig. 2a
and Fig. S4).
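
A logarithmic growth of the form $A + B \ln(t)$, as used for the fits in Fig. 2a, is linear in $\ln(t)$ and can therefore be fitted by ordinary least squares. The sketch below shows one way to do this; the function name is hypothetical, and nothing here is claimed to match the fitting procedure actually used for the figure.

```python
import math

def fit_log_growth(times, values):
    """Least-squares fit of values ~ A + B * ln(t); returns (A, B).

    times must be positive and contain at least two distinct values.
    """
    xs = [math.log(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope
```

Comparing the residuals of this fit with those of a power-law fit (a straight line in log-log coordinates) is one simple way to check which form describes a measured $\langle r_g(t) \rangle$ curve better.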

In Fig. 2b, we have chosen users with similar asymptotic $r_g(T)$ after $T = 6$ months, and
measured the jump size distribution $P(\Delta r|r_g)$ for each group. As the inset of Fig. 2b shows, users
with small $r_g$ travel mostly over small distances, whereas those with large $r_g$ tend to display a
combination of many small and a few larger jump sizes. Once we rescale the distributions with
$r_g$ (Fig. 2b), we find that the data collapse into a single curve, suggesting that a single jump
size distribution characterizes all users, independent of their $r_g$. This indicates that
$P(\Delta r|r_g) \sim r_g^{-\alpha} F(\Delta r/r_g)$, where $\alpha \approx 1.2 \pm 0.1$ and $F(x)$ is an $r_g$-independent function with asymptotic
behavior $F(x < 1) \sim x^{-\alpha}$ and rapidly decreasing for $x \gg 1$. Therefore the travel patterns
of individual users may be approximated by a Levy flight up to a distance characterized by $r_g$.
Most important, however, is the fact that the individual trajectories are bounded beyond $r_g$, thus
large displacements, which are the source of the distinct and anomalous nature of Levy flights,
are statistically absent. To understand the relationship between the different exponents, we note
that the measured probability distributions are related by $P(\Delta r) = \int_0^\infty P(\Delta r|r_g) P(r_g)\, dr_g$, which
suggests (see SM) that up to the leading order we have $\beta = \beta_r + \alpha - 1$, consistent, within error bars,
with the measured exponents. This indicates that the observed jump size distribution $P(\Delta r)$ is in
fact the convolution between the statistics of individual trajectories $P(\Delta r|r_g)$ and the population
heterogeneity $P(r_g)$, consistent with hypothesis C.
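
The exponent relation quoted above can be checked with a short scaling argument. The sketch below is not the derivation given in the SM. It assumes the pure power-law regime $r_g^0 \ll \Delta r \ll \kappa$, where $P(r_g) \sim r_g^{-\beta_r}$, and it assumes that the integral over the scaling function $F$ converges (at small $u$ the integrand behaves as $u^{\beta_r - 2}$, which is integrable for $\beta_r > 1$).

```latex
\begin{align}
P(\Delta r) &= \int_0^\infty P(\Delta r \mid r_g)\, P(r_g)\, dr_g
  \sim \int_0^\infty r_g^{-\alpha}\, F\!\left(\frac{\Delta r}{r_g}\right) r_g^{-\beta_r}\, dr_g \\
&= \Delta r^{\,1-\alpha-\beta_r} \int_0^\infty u^{\alpha+\beta_r-2}\, F(u)\, du
  \qquad \left(u = \Delta r / r_g\right) \\
&\sim \Delta r^{-\beta}, \qquad \beta = \beta_r + \alpha - 1 .
\end{align}
```

With the measured $\beta_r = 1.65$ and $\alpha = 1.2$ this gives $\beta \approx 1.85$, which agrees with the fitted $\beta = 1.75 \pm 0.15$ within error bars.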

To uncover the mechanism stabilizing $r_g$ we measured the return probability for each individual,
$F_{pt}(t)$ [22], defined as the probability that a user returns to the position where he/she was
first observed after $t$ hours (Fig. 2c). For a two dimensional random walk $F_{pt}(t)$ should follow
$\sim 1/(t \ln(t)^2)$ [22]. In contrast, we find that the return probability is characterized by several peaks
at 24 h, 48 h, and 72 h, capturing a strong tendency of humans to return to locations they visited
before, describing the recurrence and temporal periodicity inherent to human mobility [23, 24].
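
The return probability can be estimated from equally spaced location records, such as hourly tower ids. The sketch below is one possible reading of the definition in the text: it counts a return only after the user has left the starting location, which is an assumption of this example, and it assumes regularly sampled data, which the call records in $D_1$ are not. Function names are illustrative.

```python
from collections import Counter

def first_return_time(locations):
    """Steps until the user is first seen again at the starting location.

    locations: sequence of location ids sampled at equal intervals.
    Returns None if the user never leaves, or leaves and never comes back.
    """
    if not locations:
        return None
    start = locations[0]
    has_left = False
    for t in range(1, len(locations)):
        if locations[t] != start:
            has_left = True
        elif has_left:
            return t
    return None

def return_probability(users):
    """Fraction of users whose first return happens at each time step t."""
    counts = Counter()
    for locations in users:
        t = first_return_time(locations)
        if t is not None:
            counts[t] += 1
    return {t: c / len(users) for t, c in sorted(counts.items())}
```

With hourly sampling, daily routines would show up as peaks of the returned dictionary near t = 24, 48 and 72.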

To explore if individuals return to the same location over and over, we ranked each location
based on the number of times an individual was recorded in its vicinity, such that a location with
$L = 3$ represents the third most visited location for the selected individual. We find that the
probability of finding a user at a location with a given rank $L$ is well approximated by $P(L) \sim L^{-1}$,
independent of the number of locations visited by the user (Fig. 2d). People thus devote most
of their time to a few locations, while spending their remaining time in 5 to 50 places, visited with
diminished regularity. Therefore, the observed logarithmic saturation of $r_g(t)$ is rooted in the high
degree of regularity in their daily travel patterns, captured by the high return probabilities (Fig. 2c)
to a few highly frequented locations (Fig. 2d).
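
The ranking described above amounts to a rank-frequency (Zipf) tabulation of each user's visited locations. A minimal sketch, with an illustrative function name and assuming each record is simply a location id:

```python
from collections import Counter

def rank_frequency(visits):
    """Return [(rank, fraction of records)] with rank 1 = most visited location.

    visits: sequence of location ids, one per recorded observation.
    """
    counts = Counter(visits)
    total = len(visits)
    ordered = sorted(counts.values(), reverse=True)
    return [(rank, count / total) for rank, count in enumerate(ordered, start=1)]
```

Plotting the resulting fractions against rank on logarithmic axes would show a slope close to -1 if the visits follow $P(L) \sim L^{-1}$.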

An important quantity for modeling human mobility patterns is the probability $\Phi_a(x, y)$ to find
an individual $a$ in a given position $(x, y)$. As is evident from Fig. 1b, individuals live and travel
in different regions, yet each user can be assigned to a well defined area, defined by home and
workplace, where she or he can be found most of the time. We can compare the trajectories of
different users by diagonalizing each trajectory's inertia tensor, providing the probability of finding
a user in a given position (see Fig. 3a) in the user's intrinsic reference frame (see SM for the
details). A striking feature of $\Phi_a(x, y)$ is its prominent spatial anisotropy in this intrinsic reference
frame (note the different scales in Fig. 3a), and we find that the larger an individual's $r_g$, the more
pronounced is this anisotropy. To quantify this effect we defined the anisotropy ratio $S \equiv \sigma_y/\sigma_x$,
where $\sigma_x$ and $\sigma_y$ represent the standard deviation of the trajectory measured in the user's intrinsic
reference frame (see SM). We find that $S$ decreases monotonically with $r_g$ (Fig. 3c), being well
approximated with $S \sim r_g^{-\eta}$, for $\eta \approx 0.12$. Given the small value of the scaling exponent, other
functional forms may offer an equally good fit, thus mechanistic models are required to identify if
this represents a true scaling law, or only a reasonable approximation to the data.
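
The intrinsic reference frame and the ratio $S = \sigma_y/\sigma_x$ can be illustrated with a small, dependency-free sketch. It diagonalizes the 2x2 covariance matrix of the positions, which has the same principal axes as the two dimensional inertia tensor, and it assumes that the intrinsic x axis is the direction of largest spread, so that $S \le 1$. The details of the paper's construction are in the SM and may differ; the function name is illustrative.

```python
import math

def anisotropy_ratio(points):
    """Return (S, sigma_x, sigma_y) along the principal axes of a trajectory.

    points: list of planar (x, y) positions. sigma_x is taken along the axis
    of largest variance (an assumption of this sketch), so 0 <= S <= 1.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points) / n
    syy = sum((y - cy) ** 2 for _, y in points) / n
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n
    # Closed-form eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]].
    mean = 0.5 * (sxx + syy)
    half_gap = math.hypot(0.5 * (sxx - syy), sxy)
    sigma_x = math.sqrt(mean + half_gap)
    sigma_y = math.sqrt(max(mean - half_gap, 0.0))
    if sigma_x == 0.0:
        raise ValueError("all positions coincide; anisotropy is undefined")
    return sigma_y / sigma_x, sigma_x, sigma_y
```

Because the eigenvalues do not depend on how the trajectory is oriented on the map, two users with the same shape of trajectory but different headings receive the same $S$.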

To compare the trajectories of different users we remove the individual anisotropies, rescaling
each user trajectory with its respective $\sigma_x$ and $\sigma_y$. The rescaled $\tilde{\Phi}(x/\sigma_x, y/\sigma_y)$ distribution
(Fig. 3b) is similar for groups of users with considerably different $r_g$, i.e., after the anisotropy and
the $r_g$ dependence are removed all individuals appear to follow the same universal $\tilde{\Phi}(x/\sigma_x, y/\sigma_y)$
probability distribution. This is particularly evident in Fig. 3d, where we show the cross section of
$\tilde{\Phi}(x/\sigma_x, 0)$ for the three groups of users, finding that apart from the noise in the data the curves
are indistinguishable.

Taken together, our results suggest that the Levy statistics observed in bank note measurements 
capture a convolution of the population heterogeneity (2) and the motion of individual users. Indi- 
viduals display significant regularity, as they return to a few highly frequented locations, like home 
or work. This regularity does not apply to the bank notes: a bill always follows the trajectory of 
its current owner, i.e. dollar bills diffuse, but humans do not. 

The fact that individual trajectories are characterized by the same $r_g$-independent two dimensional
probability distribution $\tilde{\Phi}(x/\sigma_x, y/\sigma_y)$ suggests that key statistical characteristics of
individual trajectories are largely indistinguishable after rescaling. Therefore, our results establish the
basic ingredients of realistic agent based models, requiring us to place users in numbers proportional
to the population density of a given region and assign each user an $r_g$ taken from the
observed $P(r_g)$ distribution. Using the predicted anisotropic rescaling, combined with the density
function $\tilde{\Phi}(x/\sigma_x, y/\sigma_y)$, whose shape is provided as Table 1 in the SM, we can obtain the likelihood
of finding a user in any location. Given the known correlations between spatial proximity and
social links, our results could help quantify the role of space in network development and evolution
[25, 26, 27, 28, 29] and improve our understanding of diffusion processes [8, 30].
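
One ingredient of such an agent based model, assigning each agent an $r_g$ drawn from the truncated power law of Eq. (2), can be sketched with rejection sampling. The paper does not specify a sampling method, so the approach, the function names and the use of the fitted parameters as defaults are all assumptions of this example. Proposals are drawn from the untruncated shifted power law by inverse transform and accepted with probability $\exp(-r_g/\kappa)$.

```python
import math
import random

def sample_rg(rng, beta_r=1.65, rg0=5.8, kappa=350.0):
    """Draw one r_g (km) with density ~ (r_g + rg0)^-beta_r * exp(-r_g/kappa).

    Requires beta_r > 1 so that the proposal distribution is normalizable.
    """
    while True:
        u = 1.0 - rng.random()  # uniform in (0, 1]
        # Inverse transform for the proposal (r + rg0)^-beta_r on r >= 0.
        r = rg0 * (u ** (-1.0 / (beta_r - 1.0)) - 1.0)
        # Accept with the exponential cutoff factor, which never exceeds 1.
        if rng.random() < math.exp(-r / kappa):
            return r

def sample_rg_population(n_agents, seed=0, beta_r=1.65, rg0=5.8, kappa=350.0):
    """Return a list of n_agents radii of gyration drawn from Eq. (2)."""
    rng = random.Random(seed)
    return [sample_rg(rng, beta_r, rg0, kappa) for _ in range(n_agents)]
```

A full model would additionally place agents according to population density and draw positions from the rescaled density tabulated in the SM, neither of which is attempted here.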

FIG. 1: Basic human mobility patterns. a, Week-long trajectories of 40 mobile phone users indicate that
most individuals travel only over short distances, but a few regularly move over hundreds of kilometers.
b, The detailed trajectory of a single user. The different phone towers are shown as green
dots, and the Voronoi lattice in grey marks the approximate reception area of each tower. The dataset
studied by us records only the identity of the closest tower to a mobile user, thus we cannot identify the
position of a user within a Voronoi cell. The trajectory of the user shown in b is constructed from 186
two-hourly reports, during which the user visited a total of 12 different locations (tower vicinities). Among
these, the user is found on 96 and 67 occasions in the two most preferred locations, the frequency of visits
for each location being shown as a vertical bar. The circle represents the radius of gyration centered on
the trajectory's center of mass. c, Probability density function $P(\Delta r)$ of travel distances obtained for the
two studied datasets $D_1$ and $D_2$. The solid line indicates a truncated power law whose parameters are
provided in the text (see Eq. 1). d, The distribution $P(r_g)$ of the radius of gyration measured for the users,
where $r_g(T)$ was measured after $T = 6$ months of observation. The solid line represents a similar truncated
power law fit (see Eq. 2). The dotted, dashed and dot-dashed curves show $P(r_g)$ obtained from the standard
null models (RW, LF and TLF), where for the TLF we used the same step size distribution as the one
measured for the mobile phone users.

FIG. 2: The bounded nature of human trajectories. a, Radius of gyration $\langle r_g(t) \rangle$ vs time for mobile
phone users separated into three groups according to their final $r_g(T)$, where $T = 6$ months. The black curves
correspond to the analytical predictions for the random walk models, increasing in time as
$\langle r_g(t) \rangle|_{LF,TLF} \sim t^{3/(2+\beta)}$ (solid) and $\langle r_g(t) \rangle|_{RW} \sim t^{0.5}$ (dotted). The dashed curves correspond to a logarithmic fit of the
form $A + B \ln(t)$, where $A$ and $B$ depend on $r_g$. b, Probability density function of individual travel distances
$P(\Delta r|r_g)$ for users with $r_g$ = 4, 10, 40, 100 and 200 km. As the inset shows, each group displays a quite
different $P(\Delta r|r_g)$ distribution. After rescaling the distance and the distribution with $r_g$ (main panel), the
different curves collapse. The solid line (power law) is shown as a guide to the eye. c, Return probability
distribution, $F_{pt}(t)$. The prominent peaks capture the tendency of humans to regularly return to the locations
they visited before, in contrast with the smooth asymptotic behavior $\sim 1/(t \ln(t)^2)$ (solid line) predicted for
random walks. d, A Zipf plot showing the frequency of visiting different locations. The symbols correspond
to users that have been observed to visit 5, 10, 30, and 50 different locations. Denoting with $L$ the
rank of the location listed in the order of the visit frequency, the data is well approximated by $P(L) \sim L^{-1}$.
The inset is the same plot in linear scale, illustrating that 40% of the time individuals are found at their first
two preferred locations.

FIG. 3: The shape of human trajectories. a, The probability density function $\Phi_a(x, y)$ of finding a mobile
phone user in a location $(x, y)$ in the user's intrinsic reference frame (see SM for details). The three plots,
from left to right, were generated for 10,000 users with $r_g < 3$ km, $20 < r_g < 30$ km and $r_g > 100$ km. The
trajectories become more anisotropic as $r_g$ increases. b, After scaling each position with $\sigma_x$ and $\sigma_y$ the
resulting $\tilde{\Phi}(x/\sigma_x, y/\sigma_y)$ has approximately the same shape for each group. c, The change in the shape of
$\Phi_a(x, y)$ can be quantified by calculating the anisotropy ratio $S \equiv \sigma_y/\sigma_x$ as a function of $r_g$, which decreases as
$S \sim r_g^{-0.12}$ (solid line). Error bars represent the standard error. d, $\tilde{\Phi}(x/\sigma_x, 0)$, representing the x-axis cross
section of the rescaled distribution $\tilde{\Phi}(x/\sigma_x, y/\sigma_y)$ shown in b.



